Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New API for downloading LDML files from projects that allow sharing WS data #1309

Merged
merged 15 commits into from
Jan 17, 2025

Conversation

rmunn
Copy link
Contributor

@rmunn rmunn commented Dec 9, 2024

Fixes #1306.

To test locally, you'll need to get a project with addToSldr="true" in its writing system configuration. None of our default test projects have that, but you can add that with the following method:

  • kubectl --context=docker-desktop exec -it deploy/hg -- bash
  • Inside the container, cd /var/hg/repos/s/sena-3 or /e/elawa-dev-flex
  • hg co tip to populate the repo's working directory
  • edit the CachedSettings/SharedSettings/LexiconSettings.plsx file
  • change the <WritingSystems> line to <WritingSystems addToSldr="true">
  • hg commit -m "Add to SLDR"
    • You might have to run hg config --edit to set your username
  • hg co null to empty the working directory now that the commit is in the repo
  • Exit kubectl, go to browser
  • Go to http://localhost/api/project/getLdmlZip

If all went well, the zipfile should contain a folder named for a project ID at the root. If you get a 22-byte zip file that's empty, then something went wrong.

Currently the directory structure for each project looks like this:

{project ID}/CachedSettings/WritingSystemStore/lots of .ldml files

Do we want to change that to just this?

{project ID}/lots of .ldml files

Discussion items

  • What should the API endpoint be called?
  • Should it be GET or POST? We don't want GoogleBot or other crawlers archiving the link (hopefully the AdminRequired attribute will prevent that).
  • What permissions should be use? AdminRequired, or something else?
  • Flatten directory structure by removing CachedSettings/WritingSystemStore/?

@rmunn rmunn self-assigned this Dec 9, 2024
hgweb/command-runner.sh Outdated Show resolved Hide resolved
@rmunn
Copy link
Contributor Author

rmunn commented Dec 9, 2024

Unfortunately it looks as though we won't be able to use ZipArchive to add files to the output stream asynchronously as they come in: we'd probably run into this issue where ZipArchive tries to write the end-of-zipfile directory synchronously when it's disposed, but Kestrel doesn't allow synchronous writes to its HTTP output stream. So we might have to extract everything, then send the zip file separately. Which means that the HTTP connection might time out, so it's possible we'll have to make two API endpoints, one to prepare the zip file and one to download it. Testing required.

@rmunn rmunn linked an issue Jan 14, 2025 that may be closed by this pull request
Will return 403 Forbidden if project does not allow sharing ws data with
SLDR. Will also return same 403 Forbidden error code if project does not
exist, to avoid possibly leaking project codes.

If project exists and allows data sharing, command will return a zipfile
containing CachedSettings/WritingSystems/*.ldml from the tip revision.
@rmunn rmunn force-pushed the feat/api-for-ldml-zips branch from 8888014 to ae60963 Compare January 15, 2025 16:09
@rmunn
Copy link
Contributor Author

rmunn commented Jan 15, 2025

Rebased on top of current develop since it's been a month since this was started, and a lot has changed on the develop branch in that time.

rmunn added 2 commits January 15, 2025 14:11
Alas, this fails (when a ProjectController method is added to call
PrepareLdmlZip) with "System.InvalidOperationException: Synchronous
operations are disallowed. Call WriteAsync or set AllowSynchronousIO to
true instead."

We'll have to change this to prepare the entire zip file first, then
send it.
This is less efficient, but the only method that actually works given
the current state of the ZipFile / ZipArchive code in .NET.
Copy link

github-actions bot commented Jan 15, 2025

C# Unit Tests

104 tests  ±0   104 ✅ ±0   5s ⏱️ ±0s
 16 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit e0e8bc1. ± Comparison against base commit bb51178.

♻️ This comment has been updated with latest results.

@rmunn
Copy link
Contributor Author

rmunn commented Jan 15, 2025

Now working; marking as ready for review. Not ready to merge yet, as there are several discussion items (marked with TODO comments) in this PR, such as what the permission structure should be. But it's tested and working.

To test locally, you'll need to get a project with addToSldr="true" in its writing system configuration. None of our default test projects have that, but you can add that with the following method:

  • kubectl --context=docker-desktop exec -it deploy/hg -- bash
  • Inside the container, cd /var/hg/repos/s/sena-3 or /e/elawa-dev-flex
  • hg co tip to populate the repo's working directory
  • edit the CachedSettings/SharedSettings/LexiconSettings.plsx file
  • change the <WritingSystems> line to <WritingSystems addToSldr="true">
  • hg commit -m "Add to SLDR"
    • You might have to run hg config --edit to set your username
  • hg co null to empty the working directory now that the commit is in the repo
  • Exit kubectl, go to browser
  • Go to http://localhost/api/project/getLdmlZip

If all went well, the zipfile should contain a folder named for a project ID at the root. If you get a 22-byte zip file that's empty, then something went wrong.

@rmunn rmunn requested a review from hahn-kev January 15, 2025 20:15
@rmunn rmunn marked this pull request as ready for review January 15, 2025 20:15
Copy link
Collaborator

@hahn-kev hahn-kev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a bunch of feedback, let me know if I missed any questions you still have.

One thought is to return 404 if there's no projects found, that should make it clear what's going on rather than an empty zip being an expected result.

backend/LexBoxApi/Jobs/DelayedLexJob.cs Outdated Show resolved Hide resolved
backend/LexBoxApi/Jobs/DelayedLexJob.cs Outdated Show resolved Hide resolved
backend/LexBoxApi/Jobs/DelayedLexJob.cs Outdated Show resolved Hide resolved
backend/LexBoxApi/Services/HgService.cs Outdated Show resolved Hide resolved
backend/LexBoxApi/Services/ProjectService.cs Outdated Show resolved Hide resolved
backend/LexBoxApi/Jobs/DeleteTempDirectoryJob.cs Outdated Show resolved Hide resolved
backend/LexBoxApi/Controllers/ProjectController.cs Outdated Show resolved Hide resolved
backend/LexBoxApi/Controllers/ProjectController.cs Outdated Show resolved Hide resolved
backend/LexBoxApi/Controllers/ProjectController.cs Outdated Show resolved Hide resolved
@rmunn
Copy link
Contributor Author

rmunn commented Jan 16, 2025

@hahn-kev wrote:

left a bunch of feedback, let me know if I missed any questions you still have.

One thought is to return 404 if there's no projects found, that should make it clear what's going on rather than an empty zip being an expected result.

All comments addressed, and commit e7b98d8 sets it up to return 404 instead of an empty zip, as well as putting a UTC timestamp into the filename so that it's more obvious when the API call was run.

This is ready for re-review.

@rmunn rmunn requested a review from hahn-kev January 16, 2025 19:19
Copy link
Collaborator

@hahn-kev hahn-kev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, one minor nitpick but feel free to merge this in

backend/LexBoxApi/Services/ProjectService.cs Outdated Show resolved Hide resolved
@rmunn rmunn merged commit c0dc0bf into develop Jan 17, 2025
14 checks passed
@rmunn rmunn deleted the feat/api-for-ldml-zips branch January 17, 2025 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New API to export ldml files for projects which have "Share ws data" set
2 participants